{ "cells": [ { "cell_type": "markdown", "id": "c34f0dfb", "metadata": {}, "source": [ "[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/pyinat/pyinaturalist/main?filepath=examples%2FData%2520Visualizations%2520-%2520Regional%2520Activity%2520Report.ipynb)" ] }, { "cell_type": "markdown", "id": "attached-priority", "metadata": {}, "source": [ "# Regional activity time series visualizations\n", "This example shows how to create visualizations of iNaturalist activity over time in a given region.\n", "See https://www.inaturalist.org/places to find place IDs.\n", "\n", "Visualizations are made using [Altair](https://altair-viz.github.io), with the following metrics:\n", "* Number of observations\n", "* Number of taxa observed\n", "* Number of observers\n", "* Number of identifiers" ] }, { "cell_type": "code", "execution_count": null, "id": "prostate-overall", "metadata": {}, "outputs": [], "source": [ "from datetime import datetime\n", "\n", "import altair as alt\n", "import pandas as pd\n", "\n", "from pyinaturalist import (\n", " get_interval_ranges,\n", " iNatClient,\n", ")\n", "\n", "# Create a client for API requests\n", "client = iNatClient()\n", "\n", "# Adjustable values\n", "PLACE_ID = 6\n", "PLACE_NAME = 'Alaska'\n", "YEAR = 2020" ] }, { "cell_type": "markdown", "id": "angry-longer", "metadata": {}, "source": [ "### Observations per year\n![observations_by_year.png](images/observations_by_year.png)" ] }, { "cell_type": "code", "execution_count": null, "id": "joint-interference", "metadata": {}, "outputs": [], "source": [ "observations_by_year = client.observations.histogram(\n", " place_id=PLACE_ID,\n", " interval='year',\n", " d1='2008-01-01',\n", " d2=f'{YEAR}-12-31',\n", " verifiable=True,\n", ")\n", "observations_by_year_df = pd.DataFrame(\n", " [{'date': k, 'observations': v} for k, v in observations_by_year.raw.items()]\n", ")\n", "\n", "alt.Chart(observations_by_year_df).mark_bar().encode(x='year(date):T', y='observations:Q')" 
] }, { "cell_type": "markdown", "id": "invisible-needle", "metadata": {}, "source": [ "### Observations per month\n![observations_by_month.png](images/observations_by_month.png)" ] }, { "cell_type": "code", "execution_count": null, "id": "dietary-tours", "metadata": {}, "outputs": [], "source": [ "observations_by_month = client.observations.histogram(\n", " place_id=PLACE_ID,\n", " interval='month',\n", " d1=f'{YEAR}-01-01',\n", " d2=f'{YEAR}-12-31',\n", " verifiable=True,\n", ")\n", "observations_by_month_df = pd.DataFrame(\n", " [\n", " {'metric': 'Observations', 'date': k, 'count': v}\n", " for k, v in observations_by_month.raw.items()\n", " ]\n", ")\n", "alt.Chart(observations_by_month_df).mark_bar().encode(x='month(date):T', y='count:Q')" ] }, { "cell_type": "markdown", "id": "genetic-camping", "metadata": {}, "source": [ "### Histograms with custom metrics\n", "The API does not have a histogram endpoint for taxa observed, observers, or identifiers,\n", "so we first need to determine our date ranges of interest, and then run one search per date range.\n", "\n", "Here are a couple of helper functions to make this easier:" ] }, { "cell_type": "code", "execution_count": null, "id": "duplicate-attribute", "metadata": {}, "outputs": [], "source": [ "def count_date_range_results(function_name, start_date, end_date):\n", " \"\"\"Get the count of results for the given date range and controller method\"\"\"\n", " # paginator.count() returns only the total number of results, without fetching complete results\n", " controller = getattr(client.observations, function_name)\n", " paginator = controller(\n", " place_id=PLACE_ID,\n", " d1=start_date,\n", " d2=end_date,\n", " verifiable=True,\n", " )\n", " count = paginator.count()\n", " print(f'Total results for {start_date.strftime(\"%b\")}: {count}')\n", " return count\n", "\n", "\n", "def get_monthly_counts(function_name, label):\n", " \"\"\"Get the count of results per month for the given controller method\"\"\"\n", " 
month_ranges = get_interval_ranges(datetime(YEAR, 1, 1), datetime(YEAR, 12, 31), 'month')\n", " counts_by_month = {\n", " start_date: count_date_range_results(function_name, start_date, end_date)\n", " for (start_date, end_date) in month_ranges\n", " }\n", " return pd.DataFrame(\n", " [{'metric': label, 'date': k, 'count': v} for k, v in counts_by_month.items()]\n", " )" ] }, { "cell_type": "markdown", "id": "induced-stone", "metadata": {}, "source": [ "### Unique taxa observed per month\n![taxa_by_month.png](images/taxa_by_month.png)" ] }, { "cell_type": "code", "execution_count": null, "id": "exempt-victor", "metadata": {}, "outputs": [], "source": [ "taxa_by_month = get_monthly_counts('species_counts', 'Taxa')\n", "alt.Chart(taxa_by_month).mark_bar().encode(x='month(date):T', y='count:Q')" ] }, { "cell_type": "markdown", "id": "brief-daniel", "metadata": {}, "source": [ "### Observers per month\n![observers_by_month.png](images/observers_by_month.png)" ] }, { "cell_type": "code", "execution_count": null, "id": "generous-candy", "metadata": {}, "outputs": [], "source": [ "observers_by_month = get_monthly_counts('observers', 'Observers')\n", "alt.Chart(observers_by_month).mark_bar().encode(x='month(date):T', y='count:Q')" ] }, { "cell_type": "markdown", "id": "earlier-warren", "metadata": {}, "source": [ "### Identifiers per month\n![identifiers_by_month.png](images/identifiers_by_month.png)" ] }, { "cell_type": "code", "execution_count": null, "id": "parliamentary-edward", "metadata": {}, "outputs": [], "source": [ "identifiers_by_month = get_monthly_counts('identifiers', 'Identifiers')\n", "alt.Chart(identifiers_by_month).mark_bar().encode(x='month(date):T', y='count:Q')" ] }, { "cell_type": "markdown", "id": "another-ambassador", "metadata": {}, "source": [ "### Combine all monthly metrics into one plot\n![combined_activity_stats.png](images/combined_activity_stats.png)" ] }, { "cell_type": "code", "execution_count": null, "id": "serious-literacy", "metadata": 
{ "tags": [] }, "outputs": [], "source": [ "combined_results = pd.concat(\n", " [observations_by_month_df, taxa_by_month, observers_by_month, identifiers_by_month]\n", ")\n", "\n", "alt.Chart(\n", " combined_results,\n", " title=f'iNaturalist activity in {PLACE_NAME} ({YEAR})',\n", " width=750,\n", " height=500,\n", ").mark_line().encode(\n", " alt.X('month(date):T', axis=alt.Axis(title='Month')),\n", " alt.Y('count:Q', axis=alt.Axis(title='Count')),\n", " color='metric',\n", " strokeDash='metric',\n", ").configure_axis(\n", " labelFontSize=15,\n", " titleFontSize=20,\n", ")" ] } ], "metadata": { "celltoolbar": "Tags", "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.6" } }, "nbformat": 4, "nbformat_minor": 5 }